AITopics | weight uncertainty

Collaborating Authors

weight uncertainty

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

781877bda0783aac5f1cf765c128b437-AuthorFeedback.pdf

Neural Information Processing SystemsOct-3-2025, 07:23:42 GMT

dropout, dun, posterior, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.36)

Add feedback

Uncertainty-Aware Decoding with Minimum Bayes Risk

Daheim, Nico, Meister, Clara, Möllenhoff, Thomas, Gurevych, Iryna

arXiv.org Artificial IntelligenceMar-7-2025

Despite their outstanding performance in the majority of scenarios, contemporary language models still occasionally generate undesirable outputs, for example, hallucinated text. While such behaviors have previously been linked to uncertainty, there is a notable lack of methods that actively consider uncertainty during text generation. In this work, we show how Minimum Bayes Risk (MBR) decoding, which selects model generations according to an expected risk, can be generalized into a principled uncertainty-aware decoding method. In short, we account for model uncertainty during decoding by incorporating a posterior over model parameters into MBR's computation of expected risk. We show that this modified expected risk is useful for both choosing outputs and deciding when to abstain from generation and can provide improvements without incurring overhead. We benchmark different methods for learning posteriors and show that performance improves with prediction diversity. We release our code publicly.

aclanthology, computational linguistic, posterior, (14 more...)

arXiv.org Artificial Intelligence

2503.05318

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Thailand > Bangkok > Bangkok (0.04)
(28 more...)

Genre: Research Report (1.00)

Add feedback

A local squared Wasserstein-2 method for efficient reconstruction of models with uncertainty

Xia, Mingtao, Shen, Qijing

arXiv.org Machine LearningJun-10-2024

In this paper, we propose a local squared Wasserstein-2 (W_2) method to solve the inverse problem of reconstructing models with uncertain latent variables or parameters. A key advantage of our approach is that it does not require prior information on the distribution of the latent variables or parameters in the underlying models. Instead, our method can efficiently reconstruct the distributions of the output associated with different inputs based on empirical distributions of observation data. We demonstrate the effectiveness of our proposed method across several uncertainty quantification (UQ) tasks, including linear regression with coefficient uncertainty, training neural networks with weight uncertainty, and reconstructing ordinary differential equations (ODEs) with a latent random variable.

neural network model, standard deviation, weight uncertainty, (12 more...)

arXiv.org Machine Learning

2406.06825

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

Add feedback

Bayesian Neural Networks Avoid Encoding Complex and Perturbation-Sensitive Concepts

Ren, Qihan, Deng, Huiqi, Chen, Yunuo, Lou, Siyu, Zhang, Quanshi

arXiv.org Artificial IntelligenceDec-1-2023

In this paper, we focus on mean-field variational Bayesian Neural Networks (BNNs) and explore the representation capacity of such BNNs by investigating which types of concepts are less likely to be encoded by the BNN. It has been observed and studied that a relatively small set of interactive concepts usually emerge in the knowledge representation of a sufficiently-trained neural network, and such concepts can faithfully explain the network output. Based on this, our study proves that compared to standard deep neural networks (DNNs), it is less likely for BNNs to encode complex concepts. Experiments verify our theoretical proofs. Note that the tendency to encode less complex concepts does not necessarily imply weak representation power, considering that complex concepts exhibit low generalization power and high adversarial vulnerability. The code is available at https://github.com/sjtu-xai-lab/BNN-concepts.

bnn, interactive concept, neural network, (14 more...)

arXiv.org Artificial Intelligence

2302.13095

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > Italy > Marche > Ancona Province > Ancona (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Data-Driven Probabilistic Energy Consumption Estimation for Battery Electric Vehicles with Model Uncertainty

Maity, Ayan, Sarkar, Sudeshna

arXiv.org Artificial IntelligenceJul-2-2023

This paper presents a novel probabilistic data-driven approach to trip-level energy consumption estimation of battery electric vehicles (BEVs). As there are very few electric vehicle (EV) charging stations, EV trip energy consumption estimation can make EV routing and charging planning easier for drivers. In this research article, we propose a new driver behaviour-centric EV energy consumption estimation model using probabilistic neural networks with model uncertainty. By incorporating model uncertainty into neural networks, we have created an ensemble of neural networks using Monte Carlo approximation. Our method comprehensively considers various vehicle dynamics, driver behaviour and environmental factors to estimate EV energy consumption for a given trip. We propose relative positive acceleration (RPA), average acceleration and average deceleration as driver behaviour factors in EV energy consumption estimation and this paper shows that the use of these driver behaviour features improves the accuracy of the EV energy consumption model significantly. Instead of predicting a single-point estimate for EV trip energy consumption, this proposed method predicts a probability distribution for the EV trip energy consumption. The experimental results of our approach show that our proposed probabilistic neural network with weight uncertainty achieves a mean absolute percentage error of 9.3% and outperforms other existing EV energy consumption models in terms of accuracy.

artificial intelligence, energy consumption, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2307.00469

Country:

Asia > India > West Bengal > Kharagpur (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

DBSN: Measuring Uncertainty through Bayesian Learning of Deep Neural Network Structures

Deng, Zhijie, Luo, Yucen, Zhu, Jun, Zhang, Bo

arXiv.org Machine LearningNov-21-2019

Bayesian neural networks (BNNs) introduce uncertainty estimation to deep networks by performing Bayesian inference on network weights. However, such models bring the challenges of inference, and further BNNs with weight uncertainty rarely achieve superior performance to standard models. In this paper, we investigate a new line of Bayesian deep learning by performing Bayesian reasoning on the structure of deep neural networks. Drawing inspiration from the neural architecture search, we define the network structure as gating weights on the redundant operations between computational nodes, and apply stochastic variational inference techniques to learn the structure distributions of networks. Empirically, the proposed method substantially surpasses the advanced deep neural networks across a range of classification and segmentation tasks. More importantly, our approach also preserves benefits of Bayesian principles, producing improved uncertainty estimation than the strong baselines including MC dropout and variational BNNs algorithms (e.g. noisy EK-FAC).

dbsn, network structure, neural network, (15 more...)

arXiv.org Machine Learning

1911.09804

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.64)

Add feedback

Coordination and Trajectory Prediction for Vehicle Interactions via Bayesian Generative Modeling

Li, Jiachen, Ma, Hengbo, Zhan, Wei, Tomizuka, Masayoshi

arXiv.org Artificial IntelligenceMay-2-2019

Coordination recognition and subtle pattern prediction of future trajectories play a significant role when modeling interactive behaviors of multiple agents. Due to the essential property of uncertainty in the future evolution, deterministic predictors are not sufficiently safe and robust. In order to tackle the task of probabilistic prediction for multiple, interactive entities, we propose a coordination and trajectory prediction system (CTPS), which has a hierarchical structure including a macro-level coordination recognition module and a micro-level subtle pattern prediction module which solves a probabilistic generation task. We illustrate two types of representation of the coordination variable: categorized and real-valued, and compare their effects and advantages based on empirical studies. We also bring the ideas of Bayesian deep learning into deep generative models to generate diversified prediction hypotheses. The proposed system is tested on multiple driving datasets in various traffic scenarios, which achieves better performance than baseline approaches in terms of a set of evaluation metrics. The results also show that using categorized coordination can better capture multi-modality and generate more diversified samples than the real-valued coordination, while the latter can generate prediction hypotheses with smaller errors with a sacrifice of sample diversity. Moreover, employing neural networks with weight uncertainty is able to generate samples with larger variance and diversity.

coordination, latexit latexitsha1, th 1, (16 more...)

arXiv.org Artificial Intelligence

1905.00587

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.50)

Industry:

Transportation > Ground > Road (0.47)
Automobiles & Trucks (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Scalable Robust Kidney Exchange

McElfresh, Duncan C, Bidkhori, Hoda, Dickerson, John P

arXiv.org Artificial IntelligenceNov-8-2018

In barter exchanges, participants directly trade their endowed goods in a constrained economic setting without money. Transactions in barter exchanges are often facilitated via a central clearinghouse that must match participants even in the face of uncertainty---over participants, existence and quality of potential trades, and so on. Leveraging robust combinatorial optimization techniques, we address uncertainty in kidney exchange, a real-world barter market where patients swap (in)compatible paired donors. We provide two scalable robust methods to handle two distinct types of uncertainty in kidney exchange---over the quality and the existence of a potential match. The latter case directly addresses a weakness in all stochastic-optimization-based methods to the kidney exchange clearing problem, which all necessarily require explicit estimates of the probability of a transaction existing---a still-unsolved problem in this nascent market. We also propose a novel, scalable kidney exchange formulation that eliminates the need for an exponential-time constraint generation process in competing formulations, maintains provable optimality, and serves as a subsolver for our robust approach. For each type of uncertainty we demonstrate the benefits of robustness on real data from a large, fielded kidney exchange in the United States. We conclude by drawing parallels between robustness and notions of fairness in the kidney exchange setting.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Artificial Intelligence

1811.03532

Country: North America > United States > Maryland (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Nephrology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.86)

Add feedback

Weight Uncertainty in Neural Networks

Blundell, Charles, Cornebise, Julien, Kavukcuoglu, Koray, Wierstra, Daan

arXiv.org Machine LearningMay-21-2015

We introduce a new, efficient, principled and backpropagation-compatible algorithm for learning a probability distribution on the weights of a neural network, called Bayes by Backprop. It regularises the weights by minimising a compression cost, known as the variational free energy or the expected lower bound on the marginal likelihood. We show that this principled kind of regularisation yields comparable performance to dropout on MNIST classification. We then demonstrate how the learnt uncertainty in the weights can be used to improve generalisation in non-linear regression problems, and how this weight uncertainty can be used to drive the exploration-exploitation trade-off in reinforcement learning.

deep learning, neural network, upstream oil & gas, (15 more...)

arXiv.org Machine Learning

1505.05424

Country:

Europe > France (0.28)
Asia (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.34)

Add feedback

Exact Convex Confidence-Weighted Learning

Crammer, Koby, Dredze, Mark, Pereira, Fernando

Neural Information Processing SystemsDec-31-2009

Confidence-weighted (CW) learning [6], an online learning method for linear classifiers, maintains a Gaussian distributions over weight vectors, with a covariance matrix that represents uncertainty about weights and correlations. Confidence constraints ensure that a weight vector drawn from the hypothesis distribution correctly classifies examples with a specified probability. Within this framework, we derive a new convex form of the constraint and analyze it in the mistake bound model. Empirical evaluation with both synthetic and text data shows our version of CW learning achieves lower cumulative and out-of-sample errors than commonly used first-order and second-order online methods.

algorithm, constraint, weight vector, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback